Spencer Parr Thesis Analysis

Info

Study Overview:
The primary objective of this study is to assess the impact of Clionid sponges on coral reef ecosystems in the U.S. Virgin Islands. Coral reefs are critical for biodiversity and marine life, but have faced significant declines due to environmental stressors, including climate change, ocean acidification, and disease outbreaks. Clionid sponges, as bioeroders, contribute to coral degradation by breaking down the calcium carbonate structure of corals, accelerating reef decline. By examining the prevalence and interactions of Cliona sponges with coral over time, this research aims to understand how these sponges influence reef health and resilience, particularly in response to environmental disturbances such as bleaching events and hurricanes.

This study utilizes long-term data collected by the Territorial Coral Reef Monitoring Program (TCRMP) from 2009 to 2023 across 34 reef sites in the USVI. The research focuses on three main questions: (1) how Cliona cover and coral interactions have changed over time, (2) how shifts in coral abundance affect Cliona-coral interactions, and (3) whether Cliona presence correlates with coral health, particularly disease prevalence. Through statistical modeling, including generalized linear models (GLMs) and linear mixed-effects models (LMEMs), the study will analyze the relationship between Cliona prevalence and environmental variables such as temperature, water quality, and coral species. This analysis is intended to reveal patterns of Cliona recruitment on stressed or diseased corals and to identify species-specific vulnerabilities, providing insights into the adaptive mechanisms of Cliona and guiding reef conservation strategies in light of changing ocean conditions.

Code Explanation:
The R scripts provided manage and manipulate TCRMP data, preparing it for statistical analysis. Key steps include filtering Cliona species data, calculating prevalence, and integrating environmental factors. Models such as Linear Mixed-Effects Models (LMEMs) and Generalized Linear Models (GLMs) are applied to assess Cliona recruitment patterns, coral interactions, and environmental influences.

Spencer Parr

Metadata

Coral Health Data:
Coral health data is the base structure of the the TCRMP data collection.

# Metadata table as a data frame
metadata <- data.frame(
  Columns = c(
    "Location", "SampleDate", "SampleYear", "SampleMonth", "Period", "SampleType", "Method", 
    "Recorder", "Transect", "SPP", "Size", "Interactions", "Predation", "Notes", "Damage", 
    "BLP", "BLII", "BL, P, VP, SP", "BLType", "BLCause", "CW", "CWLight", "CWDark", 
    "OldMort", "RecMort", "Disease", "NoLesion", "LesionPattern", "LesionEdge"
  ),
  Description = c(
    "Monitoring location at which data was collected",
    "Date on which data was collected",
    "Monitoring period in which data was collected; Note - Sample year may not always match the year from the SampleDate if a location was surveyed late in the annual monitoring period",
    "Month in which data was collected; Note - when a location was not surveyed fully in the same month, the month in which sampling began is used as the sample month",
    "Refers to the sample period in which the data falls; Annual, PeakBL, PostBL, SCTLD, WS, and Juvenile are the periods",
    "Indicates the type of transect (random or permanent) from which data was collected",
    "Indicates the way in which data was collected (e.g., line intercept, belt transect)",
    "Diver who recorded the data",
    "Transect number; Typically 6 transects per site, but additional transects may have been recorded",
    "Coral identified to species; If not identifiable to species, genus is used",
    "Coral colony size; Length (maximum planar diameter), Width (perpendicular to length), and Height (perpendicular to plane of growth)",
    "Refers to other benthic entities interacting with living coral tissue; Includes macroalgae, sponges, cyanobacteria, etc.",
    "Type and number of predators (e.g., damselfish, Coralliophila sp.)",
    "Additional notes about the colony condition",
    "Indicates the type of damage experienced by the colony (e.g., broken, overturned)",
    "Indicates the bleaching status of a colony; Used before percent bleaching and paling were recorded",
    "Indicates a secondary bleaching condition in addition to 'BLP'",
    "Represents the proportion of the colony's tissue affected by bleaching or paling; VP and SP indicate intermediary stages",
    "Refers to the type or location of bleaching on a colony (e.g., Focal, Medial, Granular)",
    "Notes the cause of observed bleaching (e.g., under Lobophora sp.)",
    "Represents the proportion of tissue at the light end of the Coral Watch colony color scale",
    "Lightest colony living tissue color observed (Coral Watch scale: 1-6)",
    "Darkest colony living tissue color observed (Coral Watch scale)",
    "Proportion of the colony skeletal structure considered old partial mortality",
    "Proportion of the colony skeletal structure considered recent partial mortality",
    "Presence of Caribbean coral diseases; Proportion of the colony affected by lesions",
    "Number of lesions present on the colony",
    "Describes the location of rapid tissue loss diseases (e.g., at edges or center of colony)",
    "Describes characteristics of the active lesion margin (e.g., smooth line, irregular pattern)"
  )
)

# Create the table
metadata %>%
  kbl(caption = "Metadata for Benthic Coral Health Data") %>%
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE,
    position = "left"
  )
Metadata for Benthic Coral Health Data
Columns Description
Location Monitoring location at which data was collected
SampleDate Date on which data was collected
SampleYear Monitoring period in which data was collected; Note - Sample year may not always match the year from the SampleDate if a location was surveyed late in the annual monitoring period
SampleMonth Month in which data was collected; Note - when a location was not surveyed fully in the same month, the month in which sampling began is used as the sample month
Period Refers to the sample period in which the data falls; Annual, PeakBL, PostBL, SCTLD, WS, and Juvenile are the periods
SampleType Indicates the type of transect (random or permanent) from which data was collected
Method Indicates the way in which data was collected (e.g., line intercept, belt transect)
Recorder Diver who recorded the data
Transect Transect number; Typically 6 transects per site, but additional transects may have been recorded
SPP Coral identified to species; If not identifiable to species, genus is used
Size Coral colony size; Length (maximum planar diameter), Width (perpendicular to length), and Height (perpendicular to plane of growth)
Interactions Refers to other benthic entities interacting with living coral tissue; Includes macroalgae, sponges, cyanobacteria, etc.
Predation Type and number of predators (e.g., damselfish, Coralliophila sp.)
Notes Additional notes about the colony condition
Damage Indicates the type of damage experienced by the colony (e.g., broken, overturned)
BLP Indicates the bleaching status of a colony; Used before percent bleaching and paling were recorded
BLII Indicates a secondary bleaching condition in addition to ‘BLP’
BL, P, VP, SP Represents the proportion of the colony’s tissue affected by bleaching or paling; VP and SP indicate intermediary stages
BLType Refers to the type or location of bleaching on a colony (e.g., Focal, Medial, Granular)
BLCause Notes the cause of observed bleaching (e.g., under Lobophora sp.)
CW Represents the proportion of tissue at the light end of the Coral Watch colony color scale
CWLight Lightest colony living tissue color observed (Coral Watch scale: 1-6)
CWDark Darkest colony living tissue color observed (Coral Watch scale)
OldMort Proportion of the colony skeletal structure considered old partial mortality
RecMort Proportion of the colony skeletal structure considered recent partial mortality
Disease Presence of Caribbean coral diseases; Proportion of the colony affected by lesions
NoLesion Number of lesions present on the colony
LesionPattern Describes the location of rapid tissue loss diseases (e.g., at edges or center of colony)
LesionEdge Describes characteristics of the active lesion margin (e.g., smooth line, irregular pattern)
# Notes as a data frame
notes <- data.frame(
  ID = 1:11,
  Note = c(
    "Transects are 10m in length with typically 6 per site, but additional transects may be present.",
    "Prior to 2007 only colonies with a max planar diameter greater than 10 cm were assessed.",
    "Prior to 2005, there were lapses in recording some variables.",
    "Coral interactions have been recorded from 2009 onward.",
    "Highlighted data has not been verified due to missing data sheets.",
    "Use of CW, CWLight, and CWDark was phased out in September 2015.",
    "Macroalgae previously identified as 'encrusting Lobophora sp.' was changed to Peyssonnelia sp in October 2015.",
    "Ramicrusta sp was officially added as a macroalgae interaction in 2017.",
    "Stony Coral Tissue Loss Disease (SCTLD) was first observed in the US Virgin Islands in January 2019.",
    "Additional columns (NoLesion, LesionPattern, LesionEdge) were added in February 2020.",
    "During the sample period 'Juvenile', specific transects were assessed for coral recruitment."
  )
)

# Create the notes table
notes %>%
  kbl(caption = "Additional Notes for Coral Health Metadata") %>%
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE,
    position = "left"
  )
Additional Notes for Coral Health Metadata
ID Note
1 Transects are 10m in length with typically 6 per site, but additional transects may be present.
2 Prior to 2007 only colonies with a max planar diameter greater than 10 cm were assessed.
3 Prior to 2005, there were lapses in recording some variables.
4 Coral interactions have been recorded from 2009 onward.
5 Highlighted data has not been verified due to missing data sheets.
6 Use of CW, CWLight, and CWDark was phased out in September 2015.
7 Macroalgae previously identified as ‘encrusting Lobophora sp.’ was changed to Peyssonnelia sp in October 2015.
8 Ramicrusta sp was officially added as a macroalgae interaction in 2017.
9 Stony Coral Tissue Loss Disease (SCTLD) was first observed in the US Virgin Islands in January 2019.
10 Additional columns (NoLesion, LesionPattern, LesionEdge) were added in February 2020.
11 During the sample period ‘Juvenile’, specific transects were assessed for coral recruitment.
site_metadata <- data.frame(
  Location = c(
    "Coral Bay", "Fish Bay", "Meri Shoal", "Black Point", "Botany Bay", "Brewers Bay",
    "Buck Island STT", "Coculus Rock", "College Shoal East", "Flat Cay", "Ginsburg Fringe",
    "Grammanik Tiger FSA", "Hind Bank East FSA", "Magens Bay", "Ruperts Rock", "Savana",
    "Seahorse Cottage Shoal", "South Capella", "South Water", "St James", "Buck Island STX",
    "Buck Island STX Deep", "Cane Bay", "Cane Bay Deep", "Castle", "Eagle Ray", "Great Pond",
    "Jacks Bay", "Kings Corner", "Lang Bank EEMP", "Lang Bank Red Hind FSA", 
    "Mutton Snapper FSA", "Salt River Deep", "Salt River West", "Sprat Hole"
  ),
  Code = c(
    "CRB", "FSB", "MSR", "BPT", "BTY", "BWR", "BIT", "CKR", "CSE", "FLC", "GBF", 
    "GKT", "HBE", "MGN", "RRK", "SVN", "SHR", "SCP", "SWT", "SSJ", "BIX", "BID",
    "CBS", "CBD", "CST", "EGR", "GRP", "JKB", "KGC", "LBP", "LBH", "MTS", "SRD", "SRW", "SPH"
  ),
  Island = c(
    "STJ", "STJ", "STJ", "STT", "STT", "STT", "STT", "STT", "STT", "STT", "STT", 
    "STT", "STT", "STT", "STT", "STT", "STT", "STT", "STT", "STT", "STX", "STX", 
    "STX", "STX", "STX", "STX", "STX", "STX", "STX", "STX", "STX", "STX", "STX", "STX", "STX"
  ),
  Latitude = c(
    18.33797, 18.31417, 18.24447, 18.34450, 18.35738, 18.34403, 18.27883, 18.31257, 18.18568,
    18.31822, 18.18770, 18.18885, 18.20217, 18.37425, 18.327433, 18.34064, 18.29467, 18.26267,
    18.28068, 18.29459, 17.78500, 17.80659, 17.77388, 17.77661, 17.76278, 17.76150, 17.71097,
    17.74337, 17.69116, 17.72145, 17.82427, 17.63660, 17.78523, 17.78530, 17.73400
  ),
  Longitude = c(
    -64.70402, -64.76408, -64.75862, -64.98595, -65.03442, -64.98435, -64.89833, -64.86058, 
    -65.07677, -64.99104, -64.95998, -64.95659, -65.00158, -64.93438, -64.925967, -65.08205, 
    -64.86750, -64.87237, -64.94592, -64.83238, -64.60917, -64.59935, -64.81350, -64.81522,
    -64.59743, -64.69880, -64.65221, -64.57160, -64.90008, -64.54706, -64.44963, -64.86240,
    -64.75917, -64.75940, -64.89540
  ),
  ReefComplex = c(
    "Nearshore", "Nearshore", "Mesophotic", "Nearshore", "Nearshore", "Nearshore",
    "Offshore", "Nearshore", "Mesophotic", "Offshore", "Mesophotic", "Mesophotic", 
    "Mesophotic", "Nearshore", "Nearshore", "Offshore", "Offshore", "Offshore", 
    "Offshore", "Offshore", "Offshore", "Mesophotic", "Nearshore", "Mesophotic", 
    "Offshore", "Offshore", "Nearshore", "Nearshore", "Nearshore", "Mesophotic", 
    "Mesophotic", "Offshore", "Mesophotic", "Nearshore", "Nearshore"
  ),
  Depth = c(
    9, 6, 30, 9, 8, 6, 14, 7, 30, 12, 63, 38, 39, 7, 5, 9, 20, 20, 20, 15, 15, 33, 
    10, 38, 7, 10, 6, 14, 17, 27, 33, 24, 30, 11, 8
  ),
  YearAdded = c(
    2011, 2002, 2005, 2003, 2002, 2002, 2005, 2002, 2003, 2003, 2011, 2003, 2003, 
    2002, 2022, 2003, 2003, 2003, 2005, 2005, 2002, 2017, 2002, 2009, 2003, 2002, 
    2003, 2002, 2007, 2009, 2002, 2003, 2009, 2002, 2002
  )
)

# Create a styled table
site_metadata %>%
  kbl(caption = "Survey Sites Metadata") %>%
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = TRUE,
    position = "left"
  )
Survey Sites Metadata
Location Code Island Latitude Longitude ReefComplex Depth YearAdded
Coral Bay CRB STJ 18.33797 -64.70402 Nearshore 9 2011
Fish Bay FSB STJ 18.31417 -64.76408 Nearshore 6 2002
Meri Shoal MSR STJ 18.24447 -64.75862 Mesophotic 30 2005
Black Point BPT STT 18.34450 -64.98595 Nearshore 9 2003
Botany Bay BTY STT 18.35738 -65.03442 Nearshore 8 2002
Brewers Bay BWR STT 18.34403 -64.98435 Nearshore 6 2002
Buck Island STT BIT STT 18.27883 -64.89833 Offshore 14 2005
Coculus Rock CKR STT 18.31257 -64.86058 Nearshore 7 2002
College Shoal East CSE STT 18.18568 -65.07677 Mesophotic 30 2003
Flat Cay FLC STT 18.31822 -64.99104 Offshore 12 2003
Ginsburg Fringe GBF STT 18.18770 -64.95998 Mesophotic 63 2011
Grammanik Tiger FSA GKT STT 18.18885 -64.95659 Mesophotic 38 2003
Hind Bank East FSA HBE STT 18.20217 -65.00158 Mesophotic 39 2003
Magens Bay MGN STT 18.37425 -64.93438 Nearshore 7 2002
Ruperts Rock RRK STT 18.32743 -64.92597 Nearshore 5 2022
Savana SVN STT 18.34064 -65.08205 Offshore 9 2003
Seahorse Cottage Shoal SHR STT 18.29467 -64.86750 Offshore 20 2003
South Capella SCP STT 18.26267 -64.87237 Offshore 20 2003
South Water SWT STT 18.28068 -64.94592 Offshore 20 2005
St James SSJ STT 18.29459 -64.83238 Offshore 15 2005
Buck Island STX BIX STX 17.78500 -64.60917 Offshore 15 2002
Buck Island STX Deep BID STX 17.80659 -64.59935 Mesophotic 33 2017
Cane Bay CBS STX 17.77388 -64.81350 Nearshore 10 2002
Cane Bay Deep CBD STX 17.77661 -64.81522 Mesophotic 38 2009
Castle CST STX 17.76278 -64.59743 Offshore 7 2003
Eagle Ray EGR STX 17.76150 -64.69880 Offshore 10 2002
Great Pond GRP STX 17.71097 -64.65221 Nearshore 6 2003
Jacks Bay JKB STX 17.74337 -64.57160 Nearshore 14 2002
Kings Corner KGC STX 17.69116 -64.90008 Nearshore 17 2007
Lang Bank EEMP LBP STX 17.72145 -64.54706 Mesophotic 27 2009
Lang Bank Red Hind FSA LBH STX 17.82427 -64.44963 Mesophotic 33 2002
Mutton Snapper FSA MTS STX 17.63660 -64.86240 Offshore 24 2003
Salt River Deep SRD STX 17.78523 -64.75917 Mesophotic 30 2009
Salt River West SRW STX 17.78530 -64.75940 Nearshore 11 2002
Sprat Hole SPH STX 17.73400 -64.89540 Nearshore 8 2002
# Metadata as a data frame
benthic_cover_metadata <- data.frame(
  Columns = c(
    "SampleYear", "SampleMonth", "Period", "Location", "FilmDate",
    "NoPts", "AnalysisBy", "AnalysisDate", "Transect", "BenthicCover", "Notes"
  ),
  Description = c(
    "Monitoring period in which data was collected; Sample year may not always match the year from the FilmDate if a location was surveyed late in the annual monitoring period.",
    "Month in which data was collected; when a location was not surveyed fully in the same month, the month in which sampling began is used.",
    "Refers to the sample period in which the data falls; Periods include Annual, PeakBL, PostBL, SCTLD, and WS.",
    "Monitoring location at which data was collected.",
    "Date on which transects were filmed.",
    "Total number of random assignment points analyzed for the transect.",
    "Individual who performed the benthic cover analysis.",
    "Date on which benthic cover analysis was done.",
    "Transect number; typically 6 transects per site, but additional transects may have been filmed.",
    "All remaining columns refer to individual benthic cover categories. Values reported in 'BenthicData2017-PresRAW' are point totals, while values in 'BenthicData' are calculated benthic cover.",
    "Additional notes pertaining to benthic cover."
  )
)

# Create the table
benthic_cover_metadata %>%
  kbl(caption = "TCRMP Benthic Cover Metadata") %>%
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE,
    position = "left"
  )
TCRMP Benthic Cover Metadata
Columns Description
SampleYear Monitoring period in which data was collected; Sample year may not always match the year from the FilmDate if a location was surveyed late in the annual monitoring period.
SampleMonth Month in which data was collected; when a location was not surveyed fully in the same month, the month in which sampling began is used.
Period Refers to the sample period in which the data falls; Periods include Annual, PeakBL, PostBL, SCTLD, and WS.
Location Monitoring location at which data was collected.
FilmDate Date on which transects were filmed.
NoPts Total number of random assignment points analyzed for the transect.
AnalysisBy Individual who performed the benthic cover analysis.
AnalysisDate Date on which benthic cover analysis was done.
Transect Transect number; typically 6 transects per site, but additional transects may have been filmed.
BenthicCover All remaining columns refer to individual benthic cover categories. Values reported in ‘BenthicData2017-PresRAW’ are point totals, while values in ‘BenthicData’ are calculated benthic cover.
Notes Additional notes pertaining to benthic cover.
# Notes as a data frame
benthic_cover_notes <- data.frame(
  ID = 1:9,
  Note = c(
    "Benthic transects are the same ones on which coral health metrics are recorded. Each is 10m in length, typically 6 per site, though additional transects may have been recorded.",
    "Coral Point Count with Excel Extensions (CPCe) software was used for analysis through 2018. After 2018, random point assignment was done using R Studio.",
    "The number of points per image has varied throughout the monitoring program: 10 (2001-2011), 15 (2012-2013), and 20 (2014-present).",
    "Rhodolith (RO) was added as a benthic cover category in 2011.",
    "Peyssonnelia sp (PEY) was added as a benthic cover category in 2014.",
    "Ramicrusta sp (RAMI) was added as a benthic cover category in 2017.",
    "Data in red indicates values were unable to be verified during QAQC because the original analysis file was missing or corrupted.",
    "Stony Coral Tissue Loss Disease (SCTLD) was added as a coral condition category after its arrival in the US Virgin Islands in 2019.",
    "Cells containing 'ND' indicate that there is no data recorded for that benthic category."
  )
)

# Create the notes table
benthic_cover_notes %>%
  kbl(caption = "Additional Notes for TCRMP Benthic Cover Metadata") %>%
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE,
    position = "left"
  )
Additional Notes for TCRMP Benthic Cover Metadata
ID Note
1 Benthic transects are the same ones on which coral health metrics are recorded. Each is 10m in length, typically 6 per site, though additional transects may have been recorded.
2 Coral Point Count with Excel Extensions (CPCe) software was used for analysis through 2018. After 2018, random point assignment was done using R Studio.
3 The number of points per image has varied throughout the monitoring program: 10 (2001-2011), 15 (2012-2013), and 20 (2014-present).
4 Rhodolith (RO) was added as a benthic cover category in 2011.
5 Peyssonnelia sp (PEY) was added as a benthic cover category in 2014.
6 Ramicrusta sp (RAMI) was added as a benthic cover category in 2017.
7 Data in red indicates values were unable to be verified during QAQC because the original analysis file was missing or corrupted.
8 Stony Coral Tissue Loss Disease (SCTLD) was added as a coral condition category after its arrival in the US Virgin Islands in 2019.
9 Cells containing ‘ND’ indicate that there is no data recorded for that benthic category.

Cliona Prevalence- Coral Health Data

This code explains the data management/manipulation and graphical evidence of Cliona prevalence for each TCRMP location

  1. The data is read into R using read_csv, and then selected columns are extracted, specifically: Period, SampleType, Location, SampleYear, Method, Transect, SPP, Cliona, Spo1, Spo1ID, Spo2ID, and Spo2. This selection reduces the dataset to only the columns relevant to this analysis.

  2. Next, a new column called presence is created to indicate whether there is evidence of Cliona presence on each sample. The mutate function is used along with if_else to assign a value of 1 to presence if any of the following conditions are met: the Cliona value is greater than zero, Spo1ID or Spo2ID columns contain either “CLSP” or “BOSP”. If none of these conditions are met, presence is set to 0. After this, presence values are checked for any missing data, and replace_na is used to fill any NA values with 0, ensuring that presence is either 1 or 0 without any missing values.

  3. The code then applies several filters to refine the dataset. First, it filters for rows where Period is either “Annual” or “PostBL”, SampleType is either “Permanent” or “permanent”, and Method is either “intercept” or “50 cm belt”. This step ensures that only specific types of samples are kept for the analysis. Further, rows where Transect contains the letter “A” are removed, ensuring that only certain transect data is included. Finally, the code filters for rows where SampleYear is 2005 or later, focusing on more recent data.

  4. In the last step, additional transformations are applied. A new ID column is created to provide a unique identifier for each row, generated as a sequence from 1 to the number of rows in the filtered dataset. The SampleYear, Location, and Transect columns are converted to factor data types using as.factor to prepare them for categorical analysis. This finalizes the dataset with all necessary transformations, making it ready for further analysis or modeling.

#1
cliona <- read_csv("../Cliona.csv")%>%
  select(Period,SampleType, Location, SampleYear, Method, Transect, SPP, Cliona, Spo1, Spo1ID, Spo2ID, Spo2)%>%
  #2
  mutate(presence=if_else(Cliona>0 | Spo1ID %in% c("CLSP", "BOSP") | Spo2ID %in% c("CLSP", "BOSP"),1,0))%>%
  mutate(presence=replace_na(presence,0))%>%
  #3
  filter(Period%in% c("Annual","PostBL")&SampleType%in% c("Permanent","permanent")&Method%in% c("intercept","50 cm belt"))%>%
  filter(!grepl("A",Transect))%>%
  filter(SampleYear>=2005)%>%
  #4
  mutate(ID=1:nrow(.),
         SampleYear=as.factor(SampleYear),
         Location=as.factor(Location),
         Transect=as.factor(Transect))
  1. This R code calculates Cliona sponge prevalence across transects by grouping the cliona dataset by Location, SampleYear, and Transect. Within each group, it calculates three metrics: prev (the average presence of Cliona, indicating prevalence), freq (total occurrences of Cliona), and total (total observations in that group). After ungrouping the data, it adds a unique ID to each row. The resulting transectprev dataset provides summarized prevalence data for each unique combination of location, year, and transect.
transectprev <- cliona %>%
  group_by(Location, SampleYear, Transect) %>%
  summarise(prev = mean(presence), freq = sum(presence), total = n())%>%
  ungroup()%>%
  mutate(ID= 1:nrow(.))
  1. This code creates a summary data frame, location_means, showing the average Cliona sponge prevalence for each location. It groups the transectprev data by Location and then calculates the mean of prev (prevalence) within each group, saving the result as mean_prev. The resulting location_means data frame has each location’s name and its corresponding average Cliona prevalence.
location_means <- transectprev %>%
  group_by(Location) %>%
  summarise(mean_prev = mean(prev))
  1. This code generates a plot that visualizes the prevalence of Cliona sponges at different reef sites by creating a boxplot for each location. The ggplot function uses the transectprev dataset, which contains Cliona prevalence data, and arranges the locations on the x-axis in descending order of prevalence. Each boxplot represents the range and distribution of Cliona prevalence values at that location. To enhance the visualization, the code includes a stat_summary layer that adds a green diamond marker at the mean prevalence for each location, making it easy to compare average Cliona prevalence across sites.

    ggplot(transectprev, aes(x = reorder(Location, -prev), y = prev)) +
      geom_boxplot(fill = "lightblue") +
      stat_summary(fun = mean, geom = "point", shape = 18, size = 3, color = "darkgreen") +
      labs(title = "Prevalence of Cliona Sponges by Location (with Mean Prevalence Marked)",
           x = "Location",
           y = "Prevalence") +
      theme_minimal() +
      theme(axis.text.x = element_text(angle = 45, hjust = 1))

This plot is important because it reveals the variability in Cliona prevalence across different reef sites. Locations such as Salt River West, Brewers Bay, and Savana show significantly higher prevalence levels, as indicated by both the wider range and higher position of the boxplots and mean markers, compared to other sites with much lower prevalence. This variability suggests that certain sites may be more susceptible to Cliona sponge colonization, potentially due to environmental conditions or other factors specific to those areas.

  1. This code produces a set of line graphs showing the prevalence of Cliona sponges over time, broken down by transect and location. Each location has its own panel (using facet_wrap), and within each panel, different transects are represented by distinct lines and colors. The x-axis represents the sample year, while the y-axis shows the prevalence of Cliona sponges. The lines connect the prevalence data points over time for each transect, while points mark each specific measurement. The scale_x_continuous function formats the x-axis labels to show each unique year in the dataset, improving clarity.
# Convert SampleYear to numeric, if appropriate
# transectprev$SampleYear <- as.numeric(transectprev$SampleYear)
transectprevnum<-transectprev
transectprevnum$SampleYear <- as.numeric(transectprevnum$SampleYear)
# Then plot with the updated variable
ggplot(transectprevnum, aes(x = SampleYear, y = prev, color = factor(Transect), group = Transect)) +
  geom_line() +
  geom_point() +
  facet_wrap(~Location, scales = "free_y") +
  labs(title = "Prevalence Over Years by Transect",
       x = "Year",
       y = "Prevalence",
       color = "Transect") +
  scale_x_continuous(breaks = unique(transectprevnum$SampleYear), labels = unique(transectprevnum$SampleYear)) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        plot.title = element_text(hjust = 0.5))

This graph is valuable because it allows for temporal analysis of Cliona prevalence trends at multiple scales: across years, transects, and locations. By categegorising between the data by transect and location, the plot can reveal patterns, such as consistent increases or decreases in prevalence, or fluctuations in response to specific events or environmental changes. It could also be used to identify which transects or locations show higher susceptibility to Cliona over time.

Identifying Environmental Event Impact on Cliona Prevalence

  1. Add binary or categorical variables to indicate these events
envvariable <- transectprev %>%
  mutate(Bleaching = if_else(SampleYear %in% c(2010, 2023), 1, 0),
         Hurricane = if_else(SampleYear %in% c(2017, 2019), 1, 0),
         SCTLD = if_else(SampleYear == 2019, 1, 0))

In order to run a linear mixed effects model, i need the average size of corals grouped by site and year. But 2003-2005 does not have any measurements in the width column. meaning i must cut 2003-2005 out of this section of analysis.

clionawidth<-raw_clio
clionawidth <- clionawidth[!is.na(clionawidth$Width), ]
coral_size_summary <- clionawidth %>%
  group_by(Location, SampleYear, Transect) %>%
  summarise(mean_volume = mean(Length * Width * Height, na.rm = TRUE),
            mean_surface_area = mean(2 * (Length * Width + Width * Height + Height * Length), na.rm = TRUE),
            .groups = "drop")
   
coral_size_summary <- coral_size_summary %>%
  filter(!grepl("[A-Za-z]", Transect))

coral_size_summary <- coral_size_summary %>%
  filter(!Transect %in% c("7", "8"))

coral_size_summary <- coral_size_summary %>%
  mutate(SampleYear = as.factor(SampleYear))

envvariable<- envvariable %>%
  mutate(SampleYear = as.factor(SampleYear))

envvariable <- envvariable %>%
  left_join(coral_size_summary, by = c("Location", "SampleYear", "Transect"))
  1. Running a linear mixed-effects model
#You're gonna want to add temperature to the model. but that is gonna take a second
envvariable <- envvariable %>%
  mutate(mean_volume_scaled = scale(mean_volume))


lmem <- lmer(prev ~ Bleaching + Hurricane + SCTLD + mean_volume_scaled +
               (1 | Location) + (1 | SampleYear), 
             data = envvariable)
summary(lmem)
## Linear mixed model fit by REML ['lmerMod']
## Formula: prev ~ Bleaching + Hurricane + SCTLD + mean_volume_scaled + (1 |  
##     Location) + (1 | SampleYear)
##    Data: envvariable
## 
## REML criterion at convergence: -7625.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.3206 -0.4851 -0.2305  0.1790  9.9318 
## 
## Random effects:
##  Groups     Name        Variance  Std.Dev.
##  Location   (Intercept) 0.0005761 0.02400 
##  SampleYear (Intercept) 0.0001662 0.01289 
##  Residual               0.0035541 0.05962 
## Number of obs: 2778, groups:  Location, 33; SampleYear, 17
## 
## Fixed effects:
##                     Estimate Std. Error t value
## (Intercept)         0.032443   0.005580   5.814
## Bleaching           0.015383   0.014223   1.082
## Hurricane           0.003857   0.014090   0.274
## SCTLD              -0.013429   0.019217  -0.699
## mean_volume_scaled  0.004516   0.001212   3.727
## 
## Correlation of Fixed Effects:
##             (Intr) Blchng Hurrcn SCTLD 
## Bleaching   -0.171                     
## Hurricane   -0.174  0.068              
## SCTLD        0.000  0.000 -0.683       
## mn_vlm_scld -0.012  0.006  0.011  0.002
# Extract rows used in the model
model_data <- model.frame(lmem)

# Generate predictions
envvariable$predicted <- NA # Initialize a column for predictions
envvariable$predicted[as.numeric(rownames(model_data))] <- predict(lmem)


ggplot(envvariable, aes(x = predicted, y = prev)) +
  geom_point(alpha = 0.5) +
  geom_abline(slope = 1, intercept = 0, color = "red", linetype = "dashed") +
  labs(title = "Predicted vs Observed Cliona Prevalence",
       x = "Predicted Prevalence",
       y = "Observed Prevalence") +
  theme_minimal()

The results of the linear mixed-effects model indicate that coral size, represented by the rescaled variable mean_volume_scaled, is a significant predictor of Cliona sponge prevalence. Specifically, for every one standard deviation increase in coral volume, Cliona prevalence increases by approximately 0.0044 units. This finding suggests that larger corals are more likely to host Cliona sponges, potentially due to increased surface area or habitat availability. This relationship was statistically significant, with a t-value of 3.635, highlighting coral size as an important factor influencing Cliona colonization patterns.

In contrast, environmental events such as bleaching, hurricanes, and SCTLD did not show significant effects on Cliona prevalence in this model. While bleaching showed a slight positive association and SCTLD a small negative association, neither effect was strong enough to be statistically significant. This suggests that, at least within this dataset, these environmental disturbances may not be major drivers of Cliona sponge colonization patterns. Additionally, the random effects for location and year showed minimal variability, indicating that the fixed effects in the model largely explain the observed differences in Cliona prevalence.

The rescaling of coral size improved the model’s stability and interpretability, as evidenced by a lower restricted maximum likelihood (REML) criterion compared to the original model. The scaled variable allowed for a more meaningful comparison of effect sizes among predictors, showing that coral size is the primary driver of Cliona prevalence in this analysis. These results highlight the importance of considering coral morphology in understanding sponge-coral interactions, while also suggesting that environmental disturbances may have less influence than previously hypothesized. Future analyses could refine these findings by incorporating additional environmental or site-specific variables

  1. Running a Generalized Linear Model
# Run a Generalized Linear Model (GLM)
glm_model <- glm(prev ~ Bleaching + Hurricane + SCTLD + mean_volume_scaled,
                 data = envvariable, family = quasibinomial)
summary(glm_model)
## 
## Call:
## glm(formula = prev ~ Bleaching + Hurricane + SCTLD + mean_volume_scaled, 
##     family = quasibinomial, data = envvariable)
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        -3.32425    0.04043 -82.227  < 2e-16 ***
## Bleaching           0.34027    0.13519   2.517   0.0119 *  
## Hurricane           0.05151    0.14105   0.365   0.7150    
## SCTLD              -0.49597    0.21618  -2.294   0.0219 *  
## mean_volume_scaled  0.09563    0.01871   5.111 3.41e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for quasibinomial family taken to be 0.1220545)
## 
##     Null deviance: 265.71  on 2777  degrees of freedom
## Residual deviance: 261.60  on 2773  degrees of freedom
##   (42 observations deleted due to missingness)
## AIC: NA
## 
## Number of Fisher Scoring iterations: 6
# Extract rows used in the model
model_data <- model.frame(glm_model)

# Generate predicted probabilities
predicted_values <- predict(glm_model, type = "response")

# Add a predicted column with NA for unmatched rows
envvariable$predicted <- NA
envvariable$predicted[as.numeric(rownames(model_data))] <- predicted_values


# Plot predicted vs observed prevalence
ggplot(envvariable, aes(x = predicted, y = prev)) +
  geom_point(alpha = 0.5) +
  geom_abline(slope = 1, intercept = 0, color = "red", linetype = "dashed") +
  labs(title = "Predicted vs Observed Cliona Prevalence",
       x = "Predicted Prevalence",
       y = "Observed Prevalence") +
  theme_minimal()

# Effect of Bleaching
ggplot(envvariable, aes(x = factor(Bleaching), y = predicted)) +
  geom_boxplot() +
  labs(title = "Effect of Bleaching on Predicted Cliona Prevalence",
       x = "Bleaching Event (0 = No, 1 = Yes)",
       y = "Predicted Prevalence") +
  theme_minimal()

Spencer Parr

Cliona Prevalence by Reef Complex

This code shows the process of cleaning, graphing, and analyzing the coral health data in order to find if there is any significant difference in Cliona prevalence between Nearshore, Offshore, and Mesophotic sites.

  1. revert back to Coral Prevalence- Coral Health Data tab, steps 1-5 are identical

    #1
    cliona <- read_csv("../Cliona.csv")%>%
      select(Period,SampleType, Location, SampleYear, Method, Transect, SPP, Cliona, Spo1, Spo1ID, Spo2ID, Spo2)%>%
      #2
      mutate(presence=if_else(Cliona>0 | Spo1ID %in% c("CLSP", "BOSP") | Spo2ID %in% c("CLSP", "BOSP"),1,0))%>%
      mutate(presence=replace_na(presence,0))%>%
      #3
      filter(Period%in% c("Annual","PostBL")&SampleType%in% c("Permanent","permanent")&Method%in% c("intercept","50 cm belt"))%>%
      filter(!grepl("A",Transect))%>%
      filter(SampleYear>=2005)%>%
      #4
      mutate(ID=1:nrow(.),
             SampleYear=as.factor(SampleYear),
             Location=as.factor(Location),
             Transect=as.factor(Transect))
    #5
    transectprev <- cliona %>%
      group_by(Location, SampleYear, Transect) %>%
      summarise(prev = mean(presence), freq = sum(presence), total = n())%>%
      ungroup()%>%
      mutate(ID= 1:nrow(.))
  2. Read in the TCRMP site metadata csv. This data frame has 8 columns: Location (e.g., Coral Bay), Code (e.g., CRB), Island (e.g., STJ), Latitude (e.g., 18.338), Longitude (e.g., -64.704), ReefComplex (e.g., Nearshore), Depth (e.g., 9), and YearAdded (e.g., 2011)

    reef_complex <- read_csv("../SiteMetadata.csv")
  3. Next I merge the transectprev data frame with the reef_complex data frame, adding reef complex information (e.g., Nearshore, Offshore, Mesophotic) to the corresponding locations in transectprev based on the Location column.

    Reef_Types <- transectprev %>%
      left_join(reef_complex, by = "Location")
  4. This code groups the Reef_Types data frame by Location and ReefComplex, calculates the mean prevalence (prev) for each group while ignoring missing values, and then converts ReefComplex into a factor with the levels ordered as “Nearshore,” “Offshore,” and “Mesophotic.”

    reef_sum <- Reef_Types %>%
      group_by(Location, ReefComplex) %>%
      summarise(
        mean_prevalence = mean(prev, na.rm = TRUE)  # Calculate mean prevalence
      )%>%
      mutate( ReefComplex = factor(ReefComplex, levels = c("Nearshore", "Offshore", "Mesophotic")))
  5. This code generates a boxplot to visualize the mean prevalence of Cliona sponges across different reef complexes using the reef_sum dataset. It maps ReefComplex to the x-axis, mean_prevalence to the y-axis, and uses colors to distinguish reef complexes.

    ggplot(reef_sum, aes(x = ReefComplex, y = mean_prevalence, fill = ReefComplex)) +
      geom_boxplot(outlier.color = "red", outlier.shape = 16) +
      geom_jitter(width = 0.2, size = 2, alpha = 0.6) +  # Add jitter for data points
      labs(
        title = "Cliona Sponge Prevalence Across Reef Complexes",
        x = "Reef Complex",
        y = "Mean Prevalence"
      ) +
      theme_minimal() +
      theme(
        legend.position = "none",
        axis.text.x = element_text(angle = 45, hjust = 1)
      )

  6. Next we run a one-way Analysis of Variance (ANOVA) to test whether there is a statistically significant difference in mean Cliona sponge prevalence across different ReefComplex categories (Nearshore, Offshore, and Mesophotic).

    # Run ANOVA
    anova_result <- aov(mean_prevalence ~ ReefComplex, data = reef_sum)

    The ANOVA results indicate a statistically significant difference in the mean prevalence of Cliona sponge coverage across the three reef complexes (Nearshore, Offshore, and Mesophotic). The ReefComplex factor accounts for a Sum of Squares (Sum Sq) of 0.005884, while the residual variation (unexplained variance) accounts for 0.013323. The Mean Square (Mean Sq) for ReefComplex is 0.0029420, and the F-statistic is 6.625, with an associated p-value (Pr(>F)) of 0.00414. This p-value is below the conventional threshold of 0.05, suggesting that the differences in Cliona prevalence among the reef complexes are statistically significant.

    # Create a ANOVA table
    broom::tidy(anova_result) %>%
      gt() %>%
      tab_header(
        title = "ANOVA Results for Cliona Sponge Prevalence Across Reef Complexes"
      )
    ANOVA Results for Cliona Sponge Prevalence Across Reef Complexes
    term df sumsq meansq statistic p.value
    ReefComplex 2 0.005876101 0.002938051 6.605047 0.004197984
    Residuals 30 0.013344571 0.000444819 NA NA
  7. Next step is the TukeyHSD (Tukey’s Honest Significant Differences) test which is a post-hoc analysis used to identify which specific group pairs have significant differences in their means. It adjusts for multiple comparisons to control the family-wise error rate, ensuring that conclusions about group differences are not overstated.

    tukey_result <- TukeyHSD(anova_result)

    The TukeyHSD test shows a significant difference in Cliona sponge prevalence between Nearshore and Mesophotic reefs, with Nearshore having a higher mean prevalence (difference = 0.0331, p = 0.003). No significant differences were found between Offshore and Mesophotic reefs (p = 0.207) or Offshore and Nearshore reefs (p = 0.151). These results indicate that Nearshore reefs have notably higher prevalence compared to Mesophotic reefs, while other comparisons show no meaningful differences.

    # Create a TukeyHSD table
    broom::tidy(tukey_result) %>%
      gt() %>%
      tab_header(
        title = "TukeyHSD Results for Cliona Sponge Prevalence Across Reef Complexes"
      )
    TukeyHSD Results for Cliona Sponge Prevalence Across Reef Complexes
    term contrast null.value estimate conf.low conf.high adj.p.value
    ReefComplex Offshore-Nearshore 0 -0.01635689 -0.03765759 0.004943818 0.158150374
    ReefComplex Mesophotic-Nearshore 0 -0.03307402 -0.05562030 -0.010527744 0.003020255
    ReefComplex Mesophotic-Offshore 0 -0.01671713 -0.04008687 0.006652605 0.199058088

    We still need to check for normality of residuals and homogeneity of variances. For normality a Shapiro-Wilk statistic will be ran, and for homogeneity of variances an Levene’s test will be ran.

  8. The Shapiro-Wilk test produced a W statistic of 0.965 and a p-value of 0.3555. Since the p-value is greater than the conventional threshold of 0.05, we fail to reject the null hypothesis that the residuals are normally distributed. This indicates that the assumption of normality for the ANOVA model’s residuals is satisfied, supporting the validity of the ANOVA results.

    shapiro<-shapiro.test(residuals(anova_result))
    
    # Create a shapiro table
    broom::tidy(shapiro) %>%
      gt() %>%
      tab_header(
        title = "Shapiro-Wilk Results for Cliona Sponge Prevalence Across Reef Complexes"
      )
    Shapiro-Wilk Results for Cliona Sponge Prevalence Across Reef Complexes
    statistic p.value method
    0.9635491 0.3246365 Shapiro-Wilk normality test
  9. The Q-Q plot shows that the residuals align closely with the red reference line, indicating that they are approximately normally distributed. The histogram demonstrates that the residuals are symmetrically distributed around zero. And the density plot compares the residuals’ distribution (blue line) to a theoretical normal curve (red dashed line), showing strong alignment with minor deviations. These plots confirm that the residuals satisfy the normality assumption required for ANOVA, supporting the validity of the statistical analysis.

    # Generate individual plots
    qq_plot <- ggplot(data.frame(residuals = residuals(anova_result)), aes(sample = residuals)) +
      stat_qq() + stat_qq_line(color = "red") + 
      labs(title = "Q-Q Plot of Residuals")
    
    hist_plot <- ggplot(data.frame(residuals = residuals(anova_result)), aes(x = residuals)) +
      geom_histogram(binwidth = 0.005, fill = "lightblue", color = "black") + 
      labs(title = "Histogram of Residuals", x = "Residuals", y = "Frequency") +
      theme_minimal()
    
    density_plot <- ggplot(data.frame(residuals = residuals(anova_result)), aes(x = residuals)) +
      geom_density(color = "blue", fill = "lightblue", alpha = 0.5) +
      stat_function(fun = dnorm, args = list(mean = mean(residuals(anova_result)), 
                                             sd = sd(residuals(anova_result))),
                    color = "red", linetype = "dashed") +
      labs(title = "Density Plot with Normal Curve", x = "Residuals", y = "Density")
    
    # Arrange plots
    grid.arrange(qq_plot, hist_plot, density_plot, nrow = 1)

  10. The Levene’s test results show an F-value of 2.3661 with a p-value of 0.1111. Since the p-value is greater than the threshold of 0.05, we fail to reject the null hypothesis that the group variances are equal. This indicates that the assumption of homogeneity of variance is satisfied, supporting the appropriateness of using ANOVA for this data.

    levene<-leveneTest(mean_prevalence ~ ReefComplex, data = reef_sum)
    
    # Create a Levene table
    broom::tidy(levene) %>%
      gt() %>%
      tab_header(
        title = "Levene's Test Results for Cliona Sponge Prevalence Across Reef Complexes"
      )
    Levene's Test Results for Cliona Sponge Prevalence Across Reef Complexes
    statistic p.value df df.residual
    2.550193 0.09487255 2 30
  11. Finally, we need to visualize the original mean prevalence between reef complexes but now with TukeyHSD multcompletters. First, the TukeyHSD function is applied to the ANOVA model to compute pairwise comparisons of means for the “ReefComplex” variable, identifying which groups differ significantly. The multcompView::multcompLetters function then assigns grouping letters (e.g., “a”, “b”) based on adjusted p-values, indicating which groups are statistically similar or different. These labels are converted into a data frame, with the “ReefComplex” levels and their respective group letters for easy handling. Finally, the left_join function integrates the Tukey test results (group letters) into the reef_sum dataset.

    ####visualise the resutls#######
    
    # Extract Tukey test results
    tukey_result <- TukeyHSD(anova_result, "ReefComplex")
    
    # Convert Tukey results to group letters
    library(multcompView)
    tukey_labels <- multcompLetters(tukey_result$ReefComplex[, "p adj"])$Letters
    
    # Convert to a data frame for merging
    tukey_labels <- data.frame(
      ReefComplex = names(tukey_labels),
      group = tukey_labels
    )
    
    # Merge tukey_labels with your reef_sum data
    reef_sum <- reef_sum %>%
      left_join(tukey_labels, by = "ReefComplex")
  12. This code generates a boxplot comparing mean Cliona sponge prevalence across three reef complexes: Mesophotic, Nearshore, and Offshore, with statistical group labels from Tukey’s post-hoc test. The boxplot shows the median prevalence (horizontal line), spread (interquartile range and whiskers), and outliers (red dots). Statistical group labels (“a”, “b”, “ab”) indicate significant differences: Mesophotic reefs have significantly lower prevalence than Nearshore (p = 0.003), while Offshore overlaps with both. Nearshore reefs have the highest prevalence and variability, Mesophotic the lowest with narrow spread, and Offshore shows intermediate prevalence. This visualization highlights depth-related differences in sponge prevalence, suggesting ecological patterns tied to reef complex characteristics.

    # Plot with Tukey labels
    ggplot(reef_sum, aes(x = ReefComplex, y = mean_prevalence, fill = ReefComplex)) +
      geom_boxplot(outlier.color = "red", outlier.shape = 16) +
      geom_jitter(width = 0.2, size = 2, alpha = 0.6) +
      geom_text(
        aes(label = group, x = ReefComplex, y = max(mean_prevalence) + 0.01),
        data = reef_sum, inherit.aes = FALSE, color = "black", size = 4, vjust = -0.5
      ) +
      labs(
        title = "Cliona Sponge Prevalence Across Reef Complexes",
        x = "Reef Complex",
        y = "Mean Prevalence"
      ) +
      theme_minimal() +
      theme(
        legend.position = "none",
        axis.text.x = element_text(angle = 45, hjust = 1)
      )

Cliona Extent

This code explains the data management/manipulation and graphical evidence of Cliona extent for each TCRMP location

  1. The data is read into R using read_csv, and then selected columns are extracted, specifically: Period, SampleType, Location, SampleYear, Method, Transect, SPP, Cliona, Spo1, Spo1ID, Spo2ID, and Spo2. This selection reduces the dataset to only the columns relevant to this analysis.

  2. the code generates two new columns, CLSP and BOSP, which indicate coverage values for two types of sponges. Using the mutate and case_when functions, it examines the columns Spo1ID and Spo2ID to populate these new columns. If Spo1ID contains “CLSP”, then the value of Spo1 is assigned to the CLSP column. Similarly, if Spo2ID contains “CLSP”, then the value from Spo2 is assigned to CLSP. The same process is used to populate the BOSP column, using “BOSP” as the indicator. If neither condition is met in either case, the value is set to NA.

  3. The code then filters the data based on several criteria. First, it keeps only rows where Period is either “Annual” or “PostBL” and SampleType is either “Permanent” or “permanent”, ensuring consistency in sample type naming. Also, only rows where Method is either “intercept” or “50 cm belt” are retained, narrowing the dataset to specific sampling methods. And rows with certain letters in the Transect column (“A”, “R”, or “L”) are excluded using a filtering condition with grepl, and only data from 2005 onward (SampleYear >= 2005) is kept.

  4. Finally, the data is reshaped from a wide format to a long format using pivot_longer. This transformation takes the columns CLSP, BOSP, and Cliona and combines them into two new columns: SpongeType and Extent. The SpongeType column identifies which type of sponge coverage is being recorded (either CLSP, BOSP, or Cliona), and Extent contains the corresponding value of coverage for each type.

#1. 
clionaext <- read_csv("../Cliona.csv")%>%
  select(Period,SampleType, Location, SampleYear, Method, Transect, SPP, Cliona, Spo1, Spo1ID, Spo2ID, Spo2)%>%
  
#2.
  mutate( CLSP = case_when( Spo1ID == "CLSP" ~ Spo1, Spo2ID == "CLSP" ~ Spo2, TRUE ~ NA_real_),
    BOSP = case_when( Spo1ID == "BOSP" ~ Spo1, Spo2ID == "BOSP" ~ Spo2,
      TRUE ~ NA_real_)) %>%
  select(-Spo1, -Spo1ID, -Spo2, -Spo2ID)%>%
  
#3.
  filter(Period%in% c("Annual","PostBL")&SampleType%in% c("Permanent","permanent")&Method%in% c("intercept","50 cm belt"))%>%
  filter(!grepl("A", "R", "L",Transect))%>%
  filter(SampleYear>=2005)%>%
  mutate(ID=1:nrow(.),
         SampleYear=as.factor(SampleYear),
         Location=as.factor(Location),
         Transect=as.factor(Transect))%>%
  
#4. 
  pivot_longer(cols = c(CLSP, BOSP, Cliona), 
               names_to = "SpongeType", 
               values_to = "Extent")

For the first analysis, I ask the question: Has the extent of Cliona sponge coverage increased or decreased over time? Is there a pattern in prevalence by year?

  1. This code assigns the clionaext data frame to a new data frame named ext_time. It then updates ext_time by converting the SampleYear column from a factor to a numeric data type, ensuring that the year values are treated as continuous numbers rather than categorical labels.

    ext_time<-clionaext
    
    ext_time <- ext_time %>%
      mutate(SampleYear = as.numeric(as.character(SampleYear)))
  2. This code processes the clionaext data frame to calculate the total number of unique corals recorded in each year. First, it groups the data by SampleYear and uses n_distinct(ID) to count distinct ID values, representing unique coral observations. The resulting data frame, unique_coral_counts, contains SampleYear and total_unique_corals.

    Next, the SampleYear column is converted from a factor to numeric using mutate. This ensures the year values are treated as continuous numbers for analyses or visualizations that require numerical formatting.

unique_coral_counts <- clionaext %>%
  group_by(SampleYear) %>%
  summarise(total_unique_corals = n_distinct(ID)) 

unique_coral_counts <- unique_coral_counts%>%
  mutate(SampleYear = as.numeric(as.character(SampleYear)))
  1. This code creates a new data frame, ext_time_with_counts, by merging (left_join) the ext_time data frame with the unique_coral_counts data frame. The join is performed using the SampleYear column, which is common to both data frames.
ext_time_with_counts <- ext_time %>%
  left_join(unique_coral_counts, by = "SampleYear")
  1. This code processes the ext_time_with_counts data frame to summarize sponge coverage by year. It groups the data by SampleYear and calculates the total extent of sponge coverage (total_extent), retrieves the total unique coral count for each year (count_observations), and computes the weighted average extent (weighted_avg_extent) by dividing the total extent by the unique coral count.
weighted_summary <- ext_time_with_counts %>%
  group_by(SampleYear) %>%
  summarise(
    total_extent = sum(Extent, na.rm = TRUE),
    count_observations = first(total_unique_corals),  # Use `first` to avoid altering the count
    weighted_avg_extent = total_extent / count_observations
  )
  1. This code creates a line plot using the weighted_summary data frame to visualize the weighted average extent of Cliona sponge coverage over time. It maps SampleYear to the x-axis and weighted_avg_extent to the y-axis, adding a blue line and points to highlight the data, along with a minimal theme and descriptive labels for the title, x-axis, and y-axis.
ggplot(weighted_summary, aes(x = SampleYear, y = weighted_avg_extent)) +
  geom_line(color = "blue") +
  geom_point(size = 2) +
  labs(
    title = "Weighted Average Cliona Sponge Coverage Over Time",
    x = "Year",
    y = "Weighted Average Extent of Coverage"
  ) +
  theme_minimal()

Next I ask the question: How does the extent of coverage differ between coral species each year? Are there species that experience fluctuating or consistent coverage levels over time?

  1. This code creates a summary data frame, spp_summary, by grouping the ext_time_with_counts data frame by SampleYear and SPP (coral species). For each group, it calculates the total sponge coverage extent (total_extent), retrieves the total unique coral count for the year (count_observations), and computes the weighted average sponge extent (weighted_avg_extent) by dividing the total extent by the unique coral count. This provides a standardized metric of sponge coverage for each coral species across years.
spp_summary <- ext_time_with_counts %>%
  group_by(SampleYear, SPP) %>%
  summarise(
    total_extent = sum(Extent, na.rm = TRUE),
    count_observations = first(total_unique_corals),
    weighted_avg_extent = total_extent / count_observations
  )
  1. This code creates a line plot using the spp_summary data frame to visualize the weighted average extent of Cliona sponge coverage over time for each coral species (SPP). The x-axis represents SampleYear, the y-axis represents weighted_avg_extent, and different coral species are distinguished by color.
ggplot(spp_summary, aes(x = SampleYear, y = weighted_avg_extent, color = SPP)) +
  geom_line() +
  geom_point() +
  labs(
    title = "Weighted Average Cliona Sponge Coverage by Coral Species Over Time",
    x = "Year",
    y = "Weighted Average Extent of Coverage"
  ) +
  theme_minimal()

Spencer Parr

Substrate Preference

Ivlev’s Electivity Index

The Ivlev’s Electivity Index (Ei) is a quantitative measure used to determine an organism’s preference for certain food or habitat types relative to their availability. In this thesis, it is applied to assess the Cliona delitrix sponge’s substrate preferences on coral reefs. Specifically, the index compares the proportion of coral species colonized by Cliona to the proportion of those species available in the environment. A positive Ei value indicates a preference, while negative values suggest avoidance. This helps in understanding how Cliona sponges select coral species under different environmental conditions.

Electivity indices measure the utilization of food types (r) in relation to their abundance or availability in the environment (p). Foods that constitute a larger proportion of the diet than of the available foods are considered preferred; conversely those proportionately underrepresented in the diet are avoided.

Figure 1: Prey selectivity according to Ivlev’s selectivity index of newly emerged brown trout fry in the River Iso (NW Spain). Modified from Elías, Jacinto & Díaz et al. (2013)

While Ivlev’s Electivity Index is typically used for dietary preferences, it can still be effectively applied to Cliona delitrix sponges in this thesis, despite corals not being a food source. Cliona sponges exhibit substrate preferences for certain coral species based on factors such as calcium carbonate concentration. Some stony corals have denser calcium carbonate skeletons, which may be harder for Cliona to excavate and colonize. Thus, the sponge’s choice is driven by its ability to grow and obtain necessary nutrients, similar to dietary selectivity in food webs.

  1. Coral Abundance by Species and Location:
  • The code first calculates how many times each coral species appears at each location (site). This gives a general sense of coral distribution across different locations, providing a simple count of occurrences per species at each site.
  1. Coral Abundance by Species, Location, Transect, and Year:
  • The second part refines the calculation to be more detailed by including both transects (smaller sampling areas within locations) and sample years. This breakdown gives the number of occurrences for each species not only by location but also by transect and year, making it possible to track coral abundance more precisely across both space and time.
# Calculate coral abundance by transect and year
coral_abundance <- cliona %>%
  group_by(SPP, Location, Transect, SampleYear) %>%
  summarise(Abundance = n()) %>%
  ungroup()

clionaclean1 <- cliona %>%
  left_join(coral_abundance, by = c("SPP", "Location", "Transect", "SampleYear"))
  1. Summarizing Total Presence and Abundance:
  • The code calculates two key metrics for each coral species (SPP). The code creates a summary of each coral species, providing an overall count of how often each species was present and its total abundance across the entire dataset. This allows for a better understanding of the relative presence and prevalence of different coral species in the study area.
# Summarize data to calculate total presence and abundance of each coral species
summary_cliona <- clionaclean1 %>%
  group_by(SPP) %>%
  summarise(TotalPresence = sum(presence), 
            TotalAbundance = sum(Abundance)) %>%  # Adjust for coral abundance
  ungroup()
  • This code calculates the proportions of presence and availability for each coral species and then computes Ivlev’s Electivity Index to determine whether species are disproportionately present relative to their availability in the environment.
# Calculate proportions and then Ivlev's Electivity Index adjusted for coral abundance
 Ivlev_Index <- summary_cliona %>%
   mutate(ProportionPresence = TotalPresence / sum(TotalPresence),
         ProportionAvailability = TotalAbundance / sum(TotalAbundance),  # Use abundance
          IvlevIndex = (ProportionPresence - ProportionAvailability) /
            (ProportionPresence + ProportionAvailability))

Ivlev_Index %>%
  kable(caption = "Ivlev's Electivity Index Adjusted for Coral Abundance", digits = 4)
Ivlev’s Electivity Index Adjusted for Coral Abundance
SPP TotalPresence TotalAbundance ProportionPresence ProportionAvailability IvlevIndex
AA 72 41794 0.0378 0.0618 -0.2408
AC 0 21 0.0000 0.0000 -1.0000
AF 0 229 0.0000 0.0003 -1.0000
AG 9 1142 0.0047 0.0017 0.4736
AGSP 3 17256 0.0016 0.0255 -0.8837
AH 9 10869 0.0047 0.0161 -0.5455
AL 29 9780 0.0152 0.0145 0.0259
AP 0 4 0.0000 0.0000 -1.0000
AT 0 13 0.0000 0.0000 -1.0000
AU 0 41 0.0000 0.0001 -1.0000
CN 7 403 0.0037 0.0006 0.7210
DCY 0 60 0.0000 0.0001 -1.0000
DL 7 365 0.0037 0.0005 0.7440
DSO 1 58 0.0005 0.0001 0.7193
EF 0 232 0.0000 0.0003 -1.0000
FF 0 17 0.0000 0.0000 -1.0000
HC 0 1355 0.0000 0.0020 -1.0000
IR 0 10 0.0000 0.0000 -1.0000
IS 0 4 0.0000 0.0000 -1.0000
MAFO 0 62 0.0000 0.0001 -1.0000
MAL 0 15 0.0000 0.0000 -1.0000
MAN 0 6 0.0000 0.0000 -1.0000
MAR 0 25 0.0000 0.0000 -1.0000
MC 87 7315 0.0457 0.0108 0.6172
MD 11 2743 0.0058 0.0041 0.1751
MDA 0 3 0.0000 0.0000 -1.0000
MDSP 0 33 0.0000 0.0000 -1.0000
MF 1 24 0.0005 0.0000 0.8734
MILA 8 4438 0.0042 0.0066 -0.2193
MILC 0 702 0.0000 0.0010 -1.0000
MILSP 0 4 0.0000 0.0000 -1.0000
ML 0 9 0.0000 0.0000 -1.0000
MM 0 317 0.0000 0.0005 -1.0000
MME 0 793 0.0000 0.0012 -1.0000
MP 0 51 0.0000 0.0001 -1.0000
MYSP 0 76 0.0000 0.0001 -1.0000
OA 396 26709 0.2080 0.0395 0.6808
OFAV 81 5270 0.0425 0.0078 0.6904
OFRA 166 78250 0.0872 0.1157 -0.1406
OX 107 38927 0.0562 0.0576 -0.0120
PA 304 224707 0.1597 0.3323 -0.3509
PB 0 1 0.0000 0.0000 -1.0000
PBSP 2 1460 0.0011 0.0022 -0.3454
PC 0 116 0.0000 0.0002 -1.0000
PCL 4 91 0.0021 0.0001 0.8796
PD 0 413 0.0000 0.0006 -1.0000
PF 1 435 0.0005 0.0006 -0.1010
PP 52 107932 0.0273 0.1596 -0.7078
PS 26 1128 0.0137 0.0017 0.7823
SB 0 8 0.0000 0.0000 -1.0000
SC 0 48 0.0000 0.0001 -1.0000
SCSP 0 53 0.0000 0.0001 -1.0000
SI 76 3601 0.0399 0.0053 0.7646
SL 0 1 0.0000 0.0000 -1.0000
SR 1 5785 0.0005 0.0086 -0.8843
SS 444 81086 0.2332 0.1199 0.3209
SSPP 0 1 0.0000 0.0000 -1.0000
TC 0 1 0.0000 0.0000 -1.0000
UNK 0 1 0.0000 0.0000 -1.0000
  1. The resulting graph shows a clear separation at zero, with bars above zero (blue) indicating a preference for certain coral species and bars below zero (red) indicating avoidance. This pattern reveals that Cliona generally avoids many coral species, with only a few species showing a positive electivity index, suggesting that Cliona selectively colonizes specific corals. This insight helps identify coral species that are more vulnerable to Cliona colonization, which is important for managing coral health and understanding Cliona’s ecological impact on reef systems.
ggplot(Ivlev_Index, aes(x = reorder(SPP, IvlevIndex), y = IvlevIndex, fill = IvlevIndex > 0)) +
  geom_bar(stat = "identity", width = 0.8) +
  labs(title = "Ivlev's Electivity Index for Cliona on Different Coral Species",
       x = "Coral Species",
       y = "Ivlev's Electivity Index") +
  theme_minimal(base_size = 14) +
  scale_fill_manual(values = c("red", "steelblue"), 
                    name = "Preference",
                    labels = c("Avoidance", "Preference")) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1, size = 10),  # Adjust angle and font size
        plot.title = element_text(hjust = 0.5, size = 16, face = "bold"),  # Center and bold title
        legend.position = "top") +  # Move legend to the top
  geom_hline(yintercept = 0, linetype = "dashed", color = "black", size = 0.5)  # Add horizontal line at 0

The bar graph shows Ivlev’s Electivity Index for Cliona sponge prevalence on various coral species, quantifying the degree of preference (positive values) or avoidance (negative values) exhibited by Cliona. Coral species are displayed along the x-axis, with the y-axis representing the electivity index, ranging from -1 (complete avoidance) to 1 (complete preference). Most coral species show a strong avoidance by Cliona, evidenced by a majority of red bars with negative values. Notably, species such as MFCL, PS, and DL exhibit the highest positive electivity indices, indicating a clear preference for colonization. This suggests that these coral species may provide favorable conditions for Cliona, such as structural complexity or susceptibility to colonization. Conversely, species such as AC, AF, and AT demonstrate the most significant avoidance, with values near -1. This information suggests that there is selective colonization behavior of Cliona, which may have important implications for understanding its ecological interactions and impacts on coral reef ecosystems.

Spencer Parr

Site Map

Map 1: Spatial distribution of Cliona sponge prevalence across TCRMP monitoring sites in the U.S. Virgin Islands. Point size represents the mean prevalence percentage of Cliona sponges at each site, with larger points indicating higher prevalence. Colors depict reef complex classifications: nearshore (light green), offshore (medium green), and mesophotic (dark green). This visualization highlights variation in Cliona prevalence and reef complex types across the region.
Map 1: Spatial distribution of Cliona sponge prevalence across TCRMP monitoring sites in the U.S. Virgin Islands. Point size represents the mean prevalence percentage of Cliona sponges at each site, with larger points indicating higher prevalence. Colors depict reef complex classifications: nearshore (light green), offshore (medium green), and mesophotic (dark green). This visualization highlights variation in Cliona prevalence and reef complex types across the region.